
Informatics in Medicine Unlocked

Elsevier BV

Preprints posted in the last 90 days, ranked by how well they match Informatics in Medicine Unlocked's content profile, based on 11 papers previously published here. The average preprint has a 0.09% match score for this journal, so anything above that is already an above-average fit.

1
A Tabular Residual Neural Network for Diabetes Classification and Prediction

Hammond, A.; Afridi, M.; Balakrishna, K.

2025-12-29 endocrinology 10.64898/2025.12.29.25343132
Top 0.1%
100× avg

Diabetes Mellitus (DM) is a metabolic disorder characterized by hyperglycemia, with type 1 caused by autoimmune destruction of pancreatic beta cells and type 2 characterized by insulin resistance with progressive beta-cell dysfunction. This study applied an existing binary classification algorithm (ALTARN) to predict DM. ALTARN, a tabular attention residual neural network, uses residual connections to capture complex patterns present in tabular columns. We achieved an average training accuracy of 75.22%. Furthermore, a robust set of validation metrics was obtained via five-fold stratified cross-validation, yielding an average accuracy of 74.61%, an average precision of 72.36%, a mean recall of 79.69%, and a mean F1 score of 75.83%.
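A minimal sketch of the validation protocol this abstract reports (five-fold stratified cross-validation with accuracy, precision, recall, and F1); ALTARN itself is not reproduced here, so a stand-in scikit-learn classifier and dataset illustrate the protocol only:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer  # stand-in tabular binary task
from sklearn.ensemble import RandomForestClassifier  # stand-in for ALTARN
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

X, y = load_breast_cancer(return_X_y=True)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {"accuracy": [], "precision": [], "recall": [], "f1": []}

for train_idx, test_idx in skf.split(X, y):
    clf = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    scores["accuracy"].append(accuracy_score(y[test_idx], pred))
    scores["precision"].append(precision_score(y[test_idx], pred))
    scores["recall"].append(recall_score(y[test_idx], pred))
    scores["f1"].append(f1_score(y[test_idx], pred))

for name, vals in scores.items():
    print(f"mean {name}: {np.mean(vals):.4f}")  # averaged across the 5 folds
```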

2
An Exploratory Study of ResNet and Capsule Neural Networks for Brain Tumor Detection in MRI

Mensah, S.; Atsu, E. K. A.; Ammah, P. N. T.

2026-02-09 radiology and imaging 10.64898/2026.02.05.26345460
Top 0.3%
52× avg

Brain tumors are one of the most life-threatening diseases, requiring precise and timely detection for effective treatment. Traditional methods for brain tumor detection rely heavily on manual analysis of MRI scans, which is time-consuming, subjective, and prone to human error. With advancements in deep learning, Convolutional Neural Networks (CNNs) have become popular for medical image analysis. However, CNNs are limited in their ability to capture spatial hierarchies and pose variations, which reduces their accuracy, particularly for tasks like brain tumor segmentation where precise spatial relationships are crucial. This research introduces a hybrid Capsule Neural Network (CapsNet) and ResNet50 model designed to overcome the limitations of traditional CNNs by capturing both spatial and pose information in MRI scans. The proposed model leverages ResNet50 for feature extraction and CapsNet for handling spatial relationships, leading to more accurate segmentation. The study evaluates the model on the BraTS2020 dataset and compares its performance to state-of-the-art CNN architectures, including U-Net and pure CNN models. The hybrid model, featuring a custom 5-cycle dynamic routing algorithm to enhance capsule agreement for tumor boundaries, achieved 98% accuracy and an F1-score of 0.87, demonstrating superior performance in detecting and segmenting brain tumors. This study pioneers the systematic evaluation of the ResNet50 + CapsNet hybrid on the BraTS2020 dataset, with a tailored class weighting scheme addressing class imbalance and improving effectiveness in identifying irregularly shaped tumors and smaller tumor regions. The study offers a robust solution for automating brain tumor detection. Future work will explore the use of Capsule Networks alone for brain tumor detection in MRI data and investigate alternative Capsule Network architectures, as well as their integration into clinical decision support systems.
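The "5-cycle dynamic routing" the authors mention presumably follows routing-by-agreement as introduced for capsule networks (Sabour et al., 2017); a generic PyTorch sketch with five iterations, not the paper's exact implementation, looks like this:

```python
import torch
import torch.nn.functional as F

def squash(s, eps=1e-8):
    # CapsNet squashing nonlinearity: keeps direction, maps norm into [0, 1)
    norm2 = (s ** 2).sum(-1, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)

def dynamic_routing(u_hat, num_iters=5):
    # u_hat: votes from lower to higher capsules,
    # shape (batch, n_lower, n_higher, dim_higher)
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)  # routing logits
    for _ in range(num_iters):
        c = F.softmax(b, dim=2)                       # coupling coefficients
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)      # weighted vote sum
        v = squash(s)                                 # higher-capsule outputs
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)  # agreement update
    return v

votes = torch.randn(2, 32, 10, 16)   # 32 lower capsules voting for 10 classes
print(dynamic_routing(votes).shape)  # torch.Size([2, 10, 16])
```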

3
Automated Burn Detection from Images Using Deep Learning Models: The Role of AI in the Triage of Burn Injuries

Durgude, A.; Soni, N.; Raghuwanshi, K. C.; Awasthi, S.; Uniyal, K.; Yadav, S.; Kakani, A.; Kesharwani, P.; Mago, V.; Vathulaya, M.; Rao, N.; Chattopadhyay, D.; Kapoor, A.; Bhimsaria, D.

2025-12-31 health informatics 10.64898/2025.12.24.25337638
Top 0.3%
51× avg

Burn injuries are a significant concern in developing countries due to limited infrastructure, and treating them remains a major challenge. Manual assessment of burn severity is subjective and depends, to a large extent, on individual expertise. Artificial intelligence can automate this task with greater accuracy and improved predictions, assisting healthcare professionals in making more informed decisions while triaging burn injuries. This study established a model pipeline for detecting burn injuries in images using multiple deep learning models, including U-Net, DenseNet, ResNet, VGG, EfficientNet, and transfer learning with the Segment Anything Model 2 (SAM2). The problem was divided into two stages: 1) background removal and 2) burn skin segmentation. ResNet50, used as an encoder with a U-Net decoder, performs better for the background removal task, achieving an accuracy of 0.9757 and an intersection over union (Jaccard index) of 0.9480. DenseNet169, used as an encoder with a U-Net decoder, performs well in burn skin segmentation, achieving an accuracy of 0.9662 and an intersection over union of 0.8504. The dataset collected during the project is available for download to facilitate further research and advancements (link to dataset: https://geninfo.iitr.ac.in/projects). Total body surface area (TBSA) was estimated from predicted burn masks using scale-based calibration.
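Scale-based calibration of the kind used for the TBSA estimate can be illustrated with a toy computation: pixels of a reference object of known size give a cm²-per-pixel factor that converts mask area to real-world area. All names and values below are hypothetical stand-ins, not the paper's procedure:

```python
import numpy as np

def burned_area_cm2(burn_mask, marker_mask, marker_area_cm2):
    """Estimate real-world burn area from a predicted binary mask using a
    reference object of known size visible in the same image. TBSA% would
    further divide this by the patient's estimated total body surface area."""
    cm2_per_pixel = marker_area_cm2 / marker_mask.sum()  # known area / marker pixels
    return burn_mask.sum() * cm2_per_pixel

# toy example: a 10x10-pixel marker patch known to cover 4 cm^2
marker = np.zeros((256, 256), bool); marker[:10, :10] = True
burn = np.zeros((256, 256), bool); burn[50:150, 50:150] = True
print(f"estimated burn area: {burned_area_cm2(burn, marker, 4.0):.1f} cm^2")
```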

4
Thyroid Cancer Risk Prediction from Multimodal Datasets Using Large Language Model

Ray, P.

2026-03-06 health informatics 10.64898/2026.03.05.26347766
Top 0.3%
49× avg

Thyroid carcinoma is one of the most prevalent endocrine malignancies worldwide, and accurate preoperative differentiation between benign and malignant thyroid nodules remains clinically challenging. Current diagnostic practice depends on clinicians' individual judgment to evaluate imaging results and separate clinical tests, creating inconsistency that can lead to incorrect evaluations. Combining radiological imaging with clinical information enables healthcare providers to make more reliable predictions about patient outcomes and improves decision-making. This study introduces a deep learning framework that combines magnetic resonance imaging (MRI) data with clinical text to predict thyroid cancer. The system uses a Vision Transformer (ViT) to extract high-level MRI scan features, while a domain-adapted language model processes clinical documents containing patient medical history, symptoms, and laboratory results. A cross-modal attention mechanism merges the imaging data with the textual information, helping to identify how the two types of data are interconnected. A classification layer operates on the fused features to estimate the probability of malignancy. The experimental results show that the proposed multimodal system outperforms unimodal baselines, with higher accuracy, sensitivity, specificity, and AUC values, which can help medical personnel make better preoperative decisions.
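A speculative sketch of the cross-modal attention fusion described here, with image tokens attending to clinical-text tokens before a classification head; the dimensions, pooling, and head are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Hypothetical fusion block: ViT patch embeddings attend to
    clinical-text embeddings; the fused tokens feed a binary head."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 1)

    def forward(self, img_tokens, txt_tokens):
        fused, _ = self.attn(query=img_tokens, key=txt_tokens, value=txt_tokens)
        pooled = fused.mean(dim=1)                # average-pool fused tokens
        return torch.sigmoid(self.head(pooled))  # P(malignant)

img = torch.randn(2, 197, 256)  # stand-in ViT patch embeddings
txt = torch.randn(2, 64, 256)   # stand-in clinical-text embeddings
print(CrossModalFusion()(img, txt).shape)  # torch.Size([2, 1])
```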

5
Transfer Learning for Medical Imaging: An Empirical Evaluation of CNN Architectures on Chest Radiographs

Salve, H. S.

2026-01-08 radiology and imaging 10.64898/2026.01.07.26343591
Top 0.4%
35× avg

This paper presents a comprehensive comparative study of five state-of-the-art CNN architectures (VGG19, ResNet50, InceptionV3, DenseNet121, and EfficientNetB0) for multi-class classification of chest X-ray (CXR) images into four categories: Edema, Normal, Pneumonia, and Tuberculosis (TB). The models were trained, validated, and tested on a dataset comprising 6,092 training and 325 testing images across the four classes. Each architecture was initialized with ImageNet weights, augmented with a custom classifier, and fine-tuned under identical conditions to ensure a fair comparison. The models are evaluated on a comprehensive set of metrics, including accuracy, per-class recall, training time, and model complexity. Experimental results indicate that VGG19 achieved the highest classification accuracy of 98.15%, followed closely by ResNet50 at 97.54%. This study provides empirical evidence to guide the selection of appropriate deep learning models for chest X-ray diagnosis, balancing performance with operational constraints.
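The fine-tuning recipe described (ImageNet initialization plus a custom classifier head) might look like the following PyTorch sketch; the head size, dropout, and freezing policy are illustrative assumptions, not the paper's settings:

```python
import torch.nn as nn
from torchvision import models

# ImageNet-initialized backbone with a custom 4-way classifier head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Sequential(
    nn.Linear(model.fc.in_features, 256),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(256, 4),  # Edema, Normal, Pneumonia, TB
)

# Optionally freeze early layers so only later blocks and the head train:
for p in model.layer1.parameters():
    p.requires_grad = False
```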

6
Diagnostic accuracy of two-photon fluorescence microscopy in the Mohs micrographic surgical margins of squamous cell carcinoma

Huang, C. Z.; Ching-Roa, V. D.; Heckman, C. M.; Mould, K.; Sipprell, W. H.; Smoller, B. R.; Ibrahim, S. F.; Giacomelli, M. G.

2026-02-24 dermatology 10.64898/2026.02.21.26346787
Top 0.6%
30× avg

Cutaneous squamous cell carcinoma (SCC) can be time-consuming to treat with Mohs micrographic surgery (MMS) due to the need for intraoperative frozen section (FS) preparation. Two-photon fluorescence microscopy (TPFM) can generate H&E-equivalent images from fresh tissue specimens in a fraction of this time. The objective was to determine the accuracy of TPFM for the evaluation of SCC in MMS margins compared to conventional FS Mohs slide preparation. TPFM was used to image 144 first-stage MMS margins from patients being treated for SCC. A Mohs surgeon reviewed 44 training images and then evaluated 100 margins. After a delay, the same surgeon evaluated the corresponding FS slides. Pairs of TPFM and FS slides were reviewed by an expert dermatopathologist to form a consensus diagnosis; the main outcome was agreement with the consensus diagnosis as assessed by an independent dermatopathologist. Three margins (3%) unequivocally disagreed with the consensus on TPFM and 2 margins (2%) disagreed on FS. The sensitivity and specificity of TPFM were 95.1% and 98.2%, respectively. This study demonstrates that slide-free histology can be interpreted equivalently to conventional Mohs slide processing by both MMS surgeons and dermatopathologists with minimal training.

7
A Mobile AI-enhanced Platform for Standardized Wound Assessment and Clinical Decision Support

Abdolahnejad, M.; Mashayekhi, N.; Kyeremeh, M.; Smith, J.; Chan, M.; Fang, G.; Jegatheeswaran, T.; Chan, H. O.; Joshi, R.; Hong, C.

2026-01-23 dermatology 10.64898/2026.01.22.26344407
Top 0.6%
29× avg

Chronic wounds affect over 1.2 million Canadians and incur healthcare costs exceeding $13 billion annually, with global expenditures approaching $149 billion. Current clinical practice relies on manual measurements and subjective visual evaluations, which overestimate wound area by up to 40% and demonstrate poor-to-moderate inter-rater reliability. This variability complicates longitudinal monitoring and evidence-based treatment selection. We developed and evaluated an integrated mobile platform combining deep learning-based wound assessment with clinical decision support. A curated dataset of 1,648 de-identified clinical wound photographs was assembled from wound care clinics, representing diverse aetiologies (arterial, venous, diabetic foot ulcers, pressure injuries) and skin tones (32% Monk Skin Tone 7-10). Three convolutional neural networks were trained: (1) an EfficientNet-B7-based classifier for wound etiology, (2) a gated pressure injury staging network, and (3) a DeepLabv3 encoder-decoder architecture with ResNet backbone for multi-class tissue segmentation (epithelialization, granulation, slough, eschar). Fiducial marker-based calibration enabled automated wound size quantification. A rule-based recommendation engine mapped assessment outputs to evidence-based dressing selections. The system was deployed as a cross-platform mobile application with cloud-native backend infrastructure. The wound classification model achieved 91.75% mean accuracy across four wound categories. Pressure injury staging accuracy ranged from 67% (Stage III) to 92% (Stage I). Tissue segmentation yielded a mean Dice similarity coefficient of 0.64 ± 0.06 and pixel-level accuracy of 98%. Automated size estimation demonstrated strong correlation with manual measurements (r = 0.73, n=53), with mean absolute error of 3.7 ± 2.1 mm; 84.2% of measurements fell within the ±5 mm clinical equivalence margin. Fiducial marker detection succeeded in 93% of test images. Performance remained stable across skin tone categories and imaging conditions. This integrated platform demonstrates technical feasibility for standardized, objective wound assessment addressing documented limitations of manual practices. The system provides interpretable segmentation overlays and actionable treatment recommendations while maintaining clinician oversight. These findings support progression to prospective validation studies evaluating real-world clinical utility and patient outcomes.
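For reference, the Dice similarity coefficient reported for the tissue segmentation model can be computed from two binary masks as follows (toy masks, not the study's data):

```python
import numpy as np

def dice(pred, target, eps=1e-8):
    """Dice similarity coefficient between two binary masks:
    2 * |intersection| / (|pred| + |target|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((64, 64), bool); a[10:40, 10:40] = True  # predicted mask
b = np.zeros((64, 64), bool); b[15:45, 15:45] = True  # ground-truth mask
print(f"DSC = {dice(a, b):.3f}")
```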

8
UCSF RMaC: University of California San Francisco 3D Multi-Phase Renal Mass CT Dataset with Tumor Segmentations

Sahin, S.; Diaz, E.; Rajagopal, A.; Abtahi, M.; Jones, S.; Dai, Q.; Kramer, S.; Wang, Z.; Larson, P. E. Z.

2026-02-12 radiology and imaging 10.64898/2026.02.11.26346096
Top 0.7%
27× avg

Current standard-of-care imaging practices cannot reliably differentiate among certain renal tumors, such as benign oncocytoma and clear cell renal cell carcinoma (RCC), or between low- and high-grade RCCs. Previous work has explored using deep learning, radiomics, and texture analysis to predict renal tumor subtypes and differentiate between low- and high-grade RCCs, with mixed success. To further this work, large, diverse datasets are needed to improve model performance and provide strong evaluation sets. In this work, a dataset of 831 multi-phase 3D CT exams was curated. Each exam contains up to three contrast-enhanced CT phases. Tumor outlines or bounding boxes were annotated and registered to the image volumes. The pathology results for each tumor and relevant patient metadata are also included.

9
Quality versus quantity of training datasets for artificial intelligence-based whole liver segmentation

Castelo, A.; O'Connor, C.; Gupta, A. C.; Anderson, B. M.; Woodland, M.; Altaie, M.; Koay, E. J.; Odisio, B. C.; Tang, T. T.; Brock, K. K.

2026-02-18 radiology and imaging 10.64898/2026.02.17.26346486
Top 0.7%
27× avg

Artificial intelligence (AI)-based segmentation has many medical applications, but limited curated datasets challenge model training; this study compares the impact of dataset annotation quality and quantity on whole-liver AI segmentation performance. We obtained 3,089 abdominal computed tomography scans with whole-liver contours from MD Anderson Cancer Center (MDA) and a MICCAI challenge. A total of 249 scans were withheld for testing, of which 30 (the MICCAI challenge data) were reserved for external validation. The remaining scans were divided into mixed-curation and highly-curated groups, randomly sampled into sub-datasets of various sizes, and used to train 3D nnU-Net segmentation models. Dice similarity coefficients (DSC), surface DSC with 2mm margins (SD 2mm), the 95th percentile of Hausdorff distance (HD95), and 2D axial slice DSC (Slice DSC) were used to evaluate model performance. The highly curated 244-scan model (DSC=0.971, SD 2mm=0.958, HD95=2.98mm) did not differ significantly on 3D evaluation metrics from the mixed-curation 2,840-scan model (DSC=0.971 [p>.999], SD 2mm=0.958 [p>.999], HD95=2.87mm [p>.999]). The 710-scan mixed-curation model (Slice DSC=0.929) significantly outperformed the highly curated 244-scan model (Slice DSC=0.923 [p=0.012]) on the 30 external scans. Highly curated datasets yielded equivalent performance to datasets that were a full order of magnitude larger. The benefits of larger, mixed-curation datasets are evidenced in model generalizability metrics and local improvements. In conclusion, tradeoffs between dataset quality and quantity for model training are nuanced and goal dependent.

10
AWS Trainium vs NVIDIA CUDA for Medical Image Classification: A Comprehensive Benchmark on ChestX-ray14

Fisher, G. R.

2025-12-30 radiology and imaging 10.64898/2025.12.23.25342933
Top 0.7%
27× avg

We present a rigorous benchmark comparing AWS Trainium (trn1 instances) and NVIDIA CUDA (g5 instances with A10G GPUs) for training convolutional neural networks on medical image classification. Using the NIH ChestX-ray14 dataset with 112,120 chest radiographs and 14 thoracic disease labels, we evaluate ResNet-50 and ConvNeXt architectures across both platforms. Our key findings are threefold: (1) Trainium achieves virtually identical accuracy to CUDA for compatible architectures (ConvNeXt-Pico: F1=0.8007 vs 0.8027, Δ=0.25%), (2) modern CNN architectures using depthwise convolutions and LayerNorm (ConvNeXt-Tiny and larger) fail to compile or load on Trainium due to hardware constraints, and (3) Trainium is 3–5× more expensive than CUDA for CNN training even with correct instance sizing. We document the substantial porting effort required, including four critical XLA-specific code modifications, and provide guidance for practitioners considering Trainium for computer vision workloads.

11
A Global Atlas of Digital Dermatology to Map Innovation and Disparities

Groger, F.; Lionetti, S.; Gottfrois, P.; Gonzalez-Jimenez, A.; Groh, M.; Habermacher, L.; Labelling Consortium; Amruthalingam, L.; Pouly, M.; Navarini, A.

2025-12-29 dermatology 10.64898/2025.12.27.25342585
Top 0.9%
22× avg

The adoption of artificial intelligence in dermatology promises democratized access to healthcare, but model reliability depends on the quality and comprehensiveness of the data fueling these models. Despite rapid growth in publicly available dermatology images, the field lacks quantitative key performance indicators to measure whether new datasets expand clinical coverage or merely replicate what is already known. Here we present SkinMap, a multi-modal framework for the first comprehensive audit of the field's entire data basis. We unify the publicly available dermatology datasets into a single, queryable semantic atlas comprising more than 1.1 million images of skin conditions and quantify (i) informational novelty over time, (ii) dataset redundancy, and (iii) representation gaps across demographics and diagnoses. Despite exponential growth in dataset sizes, informational novelty over time has somewhat plateaued: some clusters, such as common neoplasms on fair skin, are densely populated, while underrepresented skin types and many rare diseases remain unaddressed. We further identify structural gaps in coverage: darker skin tones (Fitzpatrick V-VI) constitute only 5.8% of images and pediatric patients only 3.0%, while many rare diseases and phenotype combinations remain sparsely represented. SkinMap provides infrastructure to measure blind spots and steer strategic data acquisition toward undercovered regions of clinical space.

12
Detection of Malaria Infection from parasite-free blood smears

Bourriez, N.; Mahanta, S. K.; Svatko, I.; Lacassagne, E.; Atchade, A.; Leonardi, F.; Massougbodji, A.; Cohen, E.; Argy, N.; Cottrell, G.; Genovesio, A.

2026-01-05 health informatics 10.64898/2025.12.29.25343125
Top 1%
19× avg

Malaria affects almost 263 million people worldwide, most of whom live in sub-Saharan countries. In a strategy to reduce malaria-related mortality and limit transmission, diagnosis in endemic areas needs to be immediately available in the field, easy to perform, and cheap. It therefore currently relies heavily on microscopic examination of blood smears. However, several studies comparing the sensitivity of this approach with qPCR, considered the most sensitive method albeit not available in the field, found that up to half of the infected population failed to be detected by microscopy alone because no visible parasites could be found in blood smears. These so-called submicroscopic infections pose a diagnostic challenge, yet represent a huge reservoir for malaria transmission. In this study, we hypothesized that qPCR results could be predicted by deep learning from subtle cell signals present in thin blood smear images, even in the absence of visible parasites, making a sensitive diagnostic directly available in the field using a microscope and a smartphone. To test this hypothesis, we acquired a large dataset of smartphone-based blood smear images from samples tested both by microscopy and qPCR. We then focused exclusively on the slides that were "negative" from the microscopic diagnostic point of view, among which half were qPCR positive. A range of standard deep learning models was evaluated to best predict the qPCR result from these microscopy images, using various backbones along with various aggregation functions at the slide level, from a simple vote to Multiple Instance Learning with attention. Our results show that qPCR results can be predicted from parasite-free blood smear images with 62.00% accuracy (±2.5 across 4 folds), reaching 67.2% sensitivity (±9.6 across 4 folds). We then used generative models to investigate the subtle morphological variations occurring in red blood cells that may contribute to predicting malaria infection in the absence of parasites. Leveraging thin blood smears and portable deep learning, we established the first proof of concept that qPCR sensitivity can be approached through the detection of submicroscopic infections directly in the field without additional infrastructure, which could significantly improve malaria surveillance and elimination efforts.
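Attention-based Multiple Instance Learning, the slide-level aggregation mentioned above, can be sketched as follows (after Ilse et al., 2018); the embedding sizes and stand-in inputs are assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Sketch of attention-based MIL: per-field-of-view embeddings from one
    slide are pooled with learned attention weights into a single
    slide-level prediction."""
    def __init__(self, dim=512, hidden=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, 1))
        self.head = nn.Linear(dim, 1)

    def forward(self, instances):  # (n_instances, dim) for one slide
        w = torch.softmax(self.attn(instances), dim=0)    # attention over instances
        slide_embedding = (w * instances).sum(dim=0)      # weighted pooling
        return torch.sigmoid(self.head(slide_embedding))  # P(qPCR positive)

patches = torch.randn(120, 512)  # stand-in embeddings of 120 image fields
print(AttentionMIL()(patches))
```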

13
Glaucoma Detection Using Deep Learning and Prompt-Based Explainable Report Generation

Naqvi, S. A. R.; Ahmed, S. B.

2025-12-15 health informatics 10.64898/2025.12.13.25342203
Top 1%
18× avg

Glaucoma is a leading cause of irreversible blindness and requires early detection to prevent vision loss. This study proposes a novel framework for automated glaucoma detection using fundus images, integrating deep learning and explainable artificial intelligence (XAI). By unifying five public datasets (RIM-ONE, ACRIMA, DRISHTI-GS, REFUGE, and EyePACS), we created a diverse dataset to enhance model generalizability. An ensemble of five deep learning models, three convolutional neural networks (ResNet50, EfficientNet-B0, DenseNet121) and two transformer-based models (Vision Transformer, Swin Transformer), is trained for robust classification. Grad-CAM and attention rollout visualizations provided insight into model decision-making, highlighting critical regions such as the optic disc and cup. These visualizations, combined with ensemble predictions, were processed by Google Gemini 1.5 Flash to generate clinician-style diagnostic reports. The ensemble model achieved a test accuracy of 95.38% and an AUC of 0.99, outperforming individual models. This framework improves diagnostic accuracy and interpretability, bridging the gap between AI predictions and clinical utility, with potential for future integration into real-world ophthalmic workflows.
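A minimal sketch of one plausible ensembling rule, soft voting over the five models' predicted probabilities; the abstract does not specify the paper's exact combination scheme:

```python
import numpy as np

def ensemble_predict(prob_matrix, threshold=0.5):
    """prob_matrix: (n_models, n_images) predicted P(glaucoma) per model.
    Returns the averaged probability and the thresholded label per image."""
    mean_prob = prob_matrix.mean(axis=0)
    return mean_prob, (mean_prob >= threshold).astype(int)

probs = np.array([[0.91, 0.12], [0.88, 0.20], [0.95, 0.08],
                  [0.90, 0.15], [0.85, 0.30]])  # 5 models, 2 images
print(ensemble_predict(probs))
```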

14
Clinical validation of automated and multiple manual callosal angle measurement methods in idiopathic normal pressure hydrocephalus

Seo, W.; Jabur Agerberg, S.; Rashid, A.; Holmstrand, N.; Nyholm, D.; Virhammar, J.; Fallmar, D.

2026-02-14 radiology and imaging 10.64898/2026.02.12.26346185
Top 1%
18× avg

Introduction: Idiopathic normal pressure hydrocephalus (iNPH) is a partially reversible neurological disorder in which imaging biomarkers support diagnosis and surgical decision-making. The callosal angle (CA) is one of the most robust radiological markers of iNPH and has also been associated with postoperative shunt outcome. However, several manual measurement variants exist, and artificial intelligence (AI)-based tools now enable automatic CA measurement. Materials and Methods: In total, 71 patients (40 with confirmed iNPH and 31 controls) were included. Six predefined manual methods for measuring the CA were applied to preoperative 3D T1-weighted MRI and evaluated for diagnostic performance and interobserver agreement. An AI-derived automatic CA (cMRI from Combinostics) was included as a seventh method and compared with the traditional manual method (perpendicular to the bicommissural plane and through the posterior commissure). Automatic measurements were additionally assessed in pre- and postoperative scans to evaluate robustness against shunt-related artifacts. Results: All seven CA variants significantly differentiated iNPH patients from controls (p < 0.05). The traditional method showed the highest discriminative performance (AUC = 0.986, SE = 0.012), while alternative planes demonstrated slightly lower accuracy (AUC range = 0.957-0.978). Interobserver agreement for manual measurements was good to excellent (ICC = 0.687-0.977). Automatic CA measurements showed excellent correlation with the traditional method (preoperative ICC = 0.92; postoperative ICC = 0.96). Conclusion: Although several CA positions perform comparably, the traditional method remains marginally superior and is best supported by the literature. Automated CA measurements closely match expert manual assessment in pre- and postoperative imaging, supporting clinical implementation.
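For orientation, any callosal angle variant ultimately reduces to the angle between two lines placed along the lateral-ventricle walls on a chosen plane; a toy computation with hypothetical direction vectors, not clinical landmarks:

```python
import numpy as np

def angle_deg(v1, v2):
    """Angle between two direction vectors, as when measuring the callosal
    angle between the ventricular walls on a coronal slice."""
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# toy vectors along the left and right ventricular walls
print(f"CA = {angle_deg(np.array([-0.4, 1.0]), np.array([0.4, 1.0])):.1f} deg")
```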

15
Open-Source Offline-Deployable Retrieval-Augmented Large Language Model for Assisting Pancreatic Cancer Staging

Johno, H.; Amakawa, A.; Komaba, A.; Tozuka, R.; Johno, Y.; Sato, J.; Yoshimura, K.; Nakamoto, K.; Ichikawa, S.

2026-01-01 radiology and imaging 10.64898/2025.12.26.25343050
Top 1%
18× avg

Purpose: Large language models (LLMs) are increasingly applied in radiology, but key challenges remain, including data leakage from cloud-based systems, false outputs, and limited reasoning transparency. This study aimed to develop an open-source, offline-deployable retrieval-augmented LLM (RA-LLM) system in which local execution prevents data leakage and retrieval-augmented generation (RAG) improves output accuracy and transparency using reliable external knowledge (REK), demonstrated in pancreatic cancer staging. Materials and Methods: Llama-3.2 11B and Gemma-3 27B were used as local LLMs, and GPT-4o mini served as a cloud-based comparator. The Japanese pancreatic cancer guideline served as REK. Relevant REK excerpts were retrieved to generate retrieval-augmented responses. System performance, including classification accuracy, retrieval metrics, and execution time, was evaluated on 100 simulated pancreatic cancer CT cases, with non-RAG LLMs as baselines. McNemar tests were applied to TNM staging and resectability classification. Results: RAG improved TNM staging accuracy for all LLMs (GPT-4o mini 61%→90%, p<0.001; Llama-3.2 11B 53%→72%, p<0.001; Gemma-3 27B 59%→87%, p<0.001) and mildly improved resectability classification (72%→84%, p=0.012; 58%→73%, p=0.006; 77%→86%, p=0.093), with Gemma-3 27B showing performance comparable to GPT-4o mini. Retrieval performance was high (context recall = 1; context precision = 0.5-1), and local models ran at speeds comparable to the cloud-based GPT-4o mini. Conclusion: We developed an offline-deployable RA-LLM system for pancreatic cancer staging and publicly released its full source code. RA-LLMs outperformed baseline LLMs, and the offline-capable Gemma-3 27B performed comparably to the widely used cloud-based GPT-4o mini.
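The retrieval-augmented step can be sketched generically: embed guideline excerpts, retrieve the most similar ones for a case, and prepend them to the prompt. Everything below (placeholder embeddings, prompt wording) is an assumption, not the released system:

```python
import numpy as np

def retrieve(query_vec, doc_vecs, docs, k=3):
    """Return the k guideline excerpts whose embeddings are most
    cosine-similar to the query embedding."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

def build_prompt(case_report, excerpts):
    # Prepend retrieved knowledge so the local LLM grounds its answer in it.
    context = "\n".join(f"- {e}" for e in excerpts)
    return (f"Using only the guideline excerpts below, assign TNM stage "
            f"and resectability.\n{context}\n\nCase:\n{case_report}")

docs = ["guideline excerpt A", "guideline excerpt B", "guideline excerpt C"]
vecs = np.random.rand(3, 8); q = np.random.rand(8)  # placeholder embeddings
print(build_prompt("case report text here", retrieve(q, vecs, docs, k=2)))
```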

16
Uspet: Unsupervised Segmentation of PET Images

Jaakkola, M.; Karpijoki, H.; Saari, T.; Rainio, O.; Li, A.; Knuuti, J.; Virtanen, K.; Klen, R.

2025-12-15 health informatics 10.64898/2025.12.15.25342254
Top 1%
18× avg

Background: Segmentation is a routine, yet time-consuming and subjective, step in the analysis of positron emission tomography (PET) images. Automated methods have been suggested, but recent method development has focused on supervised approaches, and previously published unsupervised segmentation methods for PET images are not suited to the dynamic human total-body PET images now enabled by evolving scanner technology. Methods: In this study, we introduce an unsupervised, general-purpose automatic segmentation method for modern PET images consisting of tens of millions of voxels. We provide its implementation in an easy-to-use format and demonstrate its performance on two datasets of real human total-body images scanned using different radiotracers. Results and conclusions: Our results show that the suggested method can identify functionally distinct areas within anatomical organs. Combined with anatomical segments obtained from other imaging modalities, this offers great potential to improve clinically meaningful segmentation and reduce time-consuming manual work.
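The paper's algorithm is not reproduced here, but a generic unsupervised baseline in the same setting, clustering voxels of a dynamic PET volume by their time-activity curves, conveys the scale involved (random stand-in data):

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Each row is one voxel's time-activity curve across the dynamic frames;
# mini-batch k-means keeps memory bounded at large voxel counts.
n_voxels, n_frames = 200_000, 20
tacs = np.random.rand(n_voxels, n_frames).astype(np.float32)  # stand-in data

labels = MiniBatchKMeans(n_clusters=30, batch_size=10_000,
                         random_state=0).fit_predict(tacs)
print(np.bincount(labels)[:5])  # voxels per segment (first five segments)
```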

17
External validation of self-supervised transfer learning for noninvasive molecular subtyping of pediatric low-grade glioma using T2-weighted MRI

Yoo, J. J.; Tak, D.; Namdar, K.; Wagner, M. W.; Liu, A.; Tabori, U.; Hawkins, C.; Ertl-Wagner, B. B.; Kann, B. H.; Khalvati, F.

2026-01-30 radiology and imaging 10.64898/2026.01.27.26344883
Top 1%
16× avg

Purpose: To externally evaluate three binary classification models designed to differentiate the molecular subtype of pediatric low-grade glioma (pLGG) among BRAF Fusion, BRAF Mutation, and Wild Type on T2-weighted magnetic resonance imaging using self-supervised transfer learning, which enables effective performance in low-data settings. Materials and Methods: This retrospective study evaluates pLGG molecular subtyping models, pre-trained on data collected at Dana-Farber Cancer Institute/Boston Children's Hospital, on two datasets from the Hospital for Sick Children: one consisting of patients identified from the electronic health record between January 2000 and December 2018 (n=336) and another consisting of patients identified between January 2019 and April 2023 (n=87). These datasets consist of T2-weighted MRI of pLGG with corresponding genetic marker identifications, labelled as BRAF Fusion, BRAF Mutation, or Wild Type, and include manually annotated ground-truth segmentations that were used in the classification pipeline during evaluation. The models were evaluated using the area under the receiver operating characteristic curve (AUC). To obtain per-class probabilities across all three molecular subtypes, the output probabilities from each binary model were used as logits input to a softmax function; these probabilities were used to determine the AUC of the models on each evaluated dataset. Results: The models achieved a macro-average AUC of 0.7671 on the newer dataset from the Hospital for Sick Children but a lower macro-average AUC of 0.6463 on the older dataset. Conclusions: The evaluated pLGG molecular subtyping models have the potential for effective generalization but may require further fine-tuning for consistent performance across varying datasets.
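The abstract's conversion from three binary models to per-class probabilities (binary outputs treated as logits into a softmax) is simple to state exactly; the input values below are example binary-model outputs, not study data:

```python
import numpy as np

def per_class_probs(p_fusion, p_mutation, p_wildtype):
    """Treat the three binary models' output probabilities as logits to a
    softmax, yielding a distribution over
    {BRAF Fusion, BRAF Mutation, Wild Type}."""
    logits = np.array([p_fusion, p_mutation, p_wildtype])
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    return e / e.sum()

print(per_class_probs(0.80, 0.35, 0.10))
```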

18
Advancing Brain Tumor Diagnosis Using Deep Learning: A Systematic Review on Glioma Segmentation and Classification on Multiparametric MRI

Aresta, S.; Palmirotta, C.; Asim, M.; Battista, P.; Cava, C.; Fiore, P.; Santamato, A.; Vitali, P.; Castiglioni, I.; D'Anna, G.; Rundo, L.; Salvatore, C.

2026-01-15 radiology and imaging 10.64898/2026.01.13.26344038
Top 1%
16× avg

Brain tumors are among the most lethal cancers, with gliomas representing the most morphologically complex type. Precise and time-efficient glioma segmentation and classification are essential for accurate diagnosis, treatment planning, and patient monitoring. Magnetic resonance imaging (MRI) remains the primary imaging modality for noninvasive glioma assessment. This review systematically analyzes deep learning (DL) and artificial intelligence (AI) approaches for brain tumor segmentation and classification. Thirty-one studies, out of 310 published between 2022 and 2025, met the inclusion criteria, among which 8 performed both segmentation and classification tasks. For segmentation, most studies used publicly available multiparametric MRI datasets. Segmentation performance varied by model and tumor region, with those focused on the whole tumor region achieving the highest Dice similarity coefficient (DSC). Classical U-Nets achieved DSCs around 80%, while advanced models integrating residual or attention modules exceeded 90%. Two main classification tasks were performed: tumor type and glioma staging. Classification models primarily relied on learned features extracted from multiparametric MRI using DL models, reporting accuracies from 91.3% to 99.4%, with sensitivity and specificity typically above 95%, indicating robust predictive performance. Surprisingly, explainable AI approaches were infrequently applied, highlighting the persistent need for greater model transparency to foster clinical trust. Overall, these results demonstrate the strong potential of current AI-based segmentation and classification pipelines. These methods can help clinicians accelerate decision-making, increasing both the accuracy and efficiency of brain tumor diagnosis. These approaches may also support the development of personalized treatment plans tailored to each patient.

19
CardioPulmoNet: Modeling Cardiopulmonary Dynamics for Histopathological Diagnosis

Pham, T. D.

2026-02-20 health informatics 10.64898/2026.02.19.26346620
Top 1%
15× avg

Objective: This study investigates whether incorporating physiological coupling concepts into neural network design can support stable and interpretable feature learning for histopathological image classification under limited data conditions. Methods: A physiologically inspired architecture, termed CardioPulmoNet, is introduced to model interacting feature streams analogous to pulmonary ventilation and cardiac perfusion. Local and global tissue features are integrated through bidirectional multi-head attention, while a homeostatic regularization term encourages balanced information exchange between streams. The model was evaluated on three histopathological datasets involving oral squamous cell carcinoma, oral submucous fibrosis, and heart failure. In addition to end-to-end training, learned representations were assessed using linear support vector machines to examine feature separability. Results: CardioPulmoNet achieved performance comparable to several pretrained convolutional neural networks across the evaluated datasets. When combined with a linear classifier, improved classification performance and higher area under the receiver operating characteristic curve were observed, suggesting that the learned feature embeddings are well structured for downstream discrimination. Conclusion: These results indicate that physiologically motivated architectural constraints may contribute to stable and discriminative representation learning in computational pathology, particularly when training data are limited. The proposed framework provides a step toward integrating physiological modeling principles into medical image analysis and may support future development of transferable and interpretable learning systems for histopathological diagnosis.
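A speculative sketch of bidirectional multi-head attention between two feature streams; the balance penalty below merely stands in for the paper's homeostatic regularization term, whose exact form is not given in the abstract:

```python
import torch
import torch.nn as nn

class BidirectionalFusion(nn.Module):
    """Each stream attends to the other; a simple penalty on the gap between
    the two outputs' norms stands in for homeostatic regularization."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.a2b = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.b2a = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, a, b):
        a_new, _ = self.b2a(query=a, key=b, value=b)  # stream A reads B
        b_new, _ = self.a2b(query=b, key=a, value=a)  # stream B reads A
        balance_penalty = (a_new.norm() - b_new.norm()).pow(2)
        return a_new, b_new, balance_penalty

a = torch.randn(2, 16, 256)  # stand-in local-feature tokens
b = torch.randn(2, 49, 256)  # stand-in global-feature tokens
out_a, out_b, penalty = BidirectionalFusion()(a, b)
print(out_a.shape, out_b.shape, float(penalty))
```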

20
Using Artificial Intelligence to optimize agreement between interstitial sensors and capillary puncture in glycemic assessment and classification

Ecker, L. R.; de Santana, N. A. C.; Caldato, C. F.; Teixeira, C. E.

2026-02-05 endocrinology 10.64898/2026.02.04.26345595
Top 1%
15× avg

Introduction: Blood glucose monitoring is essential for the management of diabetes mellitus. Continuous interstitial glucose (IG) monitoring systems are less invasive than capillary blood glucose (BG) measurements, but their agreement decreases at higher glucose levels. Artificial intelligence (AI) approaches, particularly recurrent neural networks such as long short-term memory (LSTM), have shown potential to model temporal glucose dynamics and correct inter-method discrepancies. Objective: To develop and validate an AI-based model capable of predicting capillary BG values from IG data, improving agreement between methods and enhancing glycemic status classification. Methods: This retrospective observational study analyzed 708 paired BG-IG measurements obtained from published anonymized datasets. Data preprocessing included Kalman filtering, robust normalization, temporal windowing, and class balancing via oversampling. An LSTM model with dual output was trained to perform both capillary glucose regression and glycemic status classification. Model performance was assessed using regression metrics (MAE, RMSE, R2), classification metrics (accuracy, F1-score), and agreement analysis (Bland-Altman). Results: The AI model substantially reduced the mean bias from +16.27 mg/dL to -2.08 mg/dL and achieved markedly narrower limits of agreement compared with raw BG-IG differences (-129.5 to +162.0 mg/dL vs. -47.3 to +43.2 mg/dL). Glycemic classification accuracy was high for hyperglycemia (94.6%), prediabetes (93.7%), and normoglycemia (100%), with lower performance observed for hypoglycemia (66.7%). Conclusion: LSTM-based AI modeling demonstrated strong capability to predict capillary BG from IG measurements and to correct inter-method discordance. These findings support the potential integration of AI-enhanced glucose estimation into clinical monitoring systems to improve therapeutic decision-making.
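The Bland-Altman agreement analysis reported here (mean bias and 95% limits of agreement) is straightforward to compute from paired readings; the values below are illustrative, not the study's data:

```python
import numpy as np

def bland_altman(reference, estimate):
    """Mean bias and 95% limits of agreement between paired measurements."""
    diff = np.asarray(estimate) - np.asarray(reference)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)  # 95% limits under normality
    return bias, bias - half_width, bias + half_width

bg = np.array([90, 140, 210, 65, 180], float)  # capillary readings (mg/dL)
ig = np.array([98, 150, 235, 70, 200], float)  # interstitial readings (mg/dL)
print("bias, lower LoA, upper LoA:", bland_altman(bg, ig))
```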